Skip to content

Conversation

devin-ai-integration[bot]
Copy link
Contributor

What

Fixes migration guide validation to handle MDX imports and YAML frontmatter that precede the main title. Currently, the validation expects the migration guide title to be on line 1, but valid guides may have metadata or imports before the title.

This addresses the failing test in PR #67077 where source-zendesk-support-migrations.md begins with an MDX import statement on line 1, causing validation to fail despite having a proper title on line 3.

How

  1. Added get_first_heading() helper function in helpers.py:

    • Parses markdown content line by line
    • Skips YAML frontmatter (content between --- markers)
    • Skips MDX import statements (lines starting with import )
    • Skips empty lines
    • Returns the first line starting with #
  2. Updated CheckMigrationGuide._run() in documentation.py:

    • Uses the new helper instead of checking only the first line
    • Adds separate error case for files with no headings at all
    • Preserves the existing requirement for correct migration guide title format

Review guide

  1. airbyte-ci/connectors/connectors_qa/src/connectors_qa/checks/documentation/helpers.py

    • Review the frontmatter state machine logic (lines 171-177)
    • Verify MDX import detection is sufficient (line 182)
    • Check edge case handling for malformed content
  2. airbyte-ci/connectors/connectors_qa/src/connectors_qa/checks/documentation/documentation.py

    • Verify the new validation logic flow (lines 55-70)
    • Check that error messages are appropriate
    • Ensure the change maintains the same validation requirements
  3. Testing considerations:

    • Verify this fixes the source-zendesk-support case
    • Spot check other migration guides to ensure no regressions
    • Consider testing edge cases like malformed frontmatter

User Impact

Positive: Migration guides with valid MDX imports or YAML frontmatter will now pass validation correctly, unblocking connector releases.

Potential risks: The parsing logic could have edge cases that incorrectly skip over actual titles, though this is unlikely given the simple line-by-line approach.

Can this PR be safely reverted and rolled back?

  • YES 💚

This change only makes validation more permissive for valid content patterns. Reverting would restore the original strict first-line validation without breaking existing functionality.


Link to Devin session: https://app.devin.ai/sessions/d66e3af38e1d4d5396c475c9215e827f

Requested by: Ian Alton ([email protected])

This change makes the migration guide validation more flexible by:
- Adding a get_first_heading() helper that skips MDX imports, YAML frontmatter, and empty lines
- Updating CheckMigrationGuide to use this helper instead of checking only the first line
- Preserving the requirement for a proper Migration Guide title

This fixes the validation failure for source-zendesk-support which has an MDX import on line 1.

Requested by: Ian Alton ([email protected])
Related PR: #67077

Co-Authored-By: [email protected] <[email protected]>
Copy link
Contributor Author

Original prompt from [email protected]
@Devin Take a look at the documentaton.py tests in airbyte: airbyte-ci/connectors/connectors_qa/src/connectors_qa/checks/documentation/documentation.py

There's a failing test here: <https://github.com/airbytehq/airbyte/actions/runs/18244805367/job/51951361039?pr=67077>

The reason the test is failing is because the migration guide begins with an MDX import.

I'd like to preserve the requirement to have a proper Migration Guide title. However, there are valid cases where Markdown metadata or imports will precede this, so this title should not necessarily be on the first line. Can you provide a suggestion for how this logic will be a bit more flexible?
Thread URL: https://airbytehq-team.slack.com/archives/D08FX8EC9L0/p1760036729536019?thread_ts=1760036729.536019

Copy link
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Copy link
Contributor

github-actions bot commented Oct 9, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Helpful Resources

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • /format-fix - Fixes most formatting issues.
  • /bump-version - Bumps connector versions.
    • You can specify a custom changelog by passing changelog. Example: /bump-version changelog="My cool update"
    • Leaving the changelog arg blank will auto-populate the changelog from the PR title.
  • /run-cat-tests - Runs legacy CAT tests (Connector Acceptance Tests)
  • /build-connector-images - Builds and publishes a pre-release docker image for the modified connector(s).
  • JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
    • /bump-bulk-cdk-version type=patch changelog='foo' - Bump the Bulk CDK's version. type can be major/minor/patch.
  • Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.

📝 Edit this welcome message.

@ian-at-airbyte
Copy link
Contributor

ian-at-airbyte commented Oct 9, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (1724f67)

@ian-at-airbyte
Copy link
Contributor

ian-at-airbyte commented Oct 9, 2025

/run-connector-tests

Connector CI Tests Started

These tests will leverage Airbyte's integration test credentials.

Check job output.
✅ Connector CI Tests job completed successfully. See logs for details.

devin-ai-integration bot and others added 2 commits October 9, 2025 19:42
- Split test_fail_when_migration_guide_file_does_not_start_with_correct_header into two tests:
  - test_fail_when_migration_guide_file_is_empty: Tests empty file case
  - test_fail_when_migration_guide_file_does_not_start_with_correct_header: Tests wrong header case
- Add test_pass_when_migration_guide_has_mdx_imports_before_title: Verifies MDX imports are properly skipped before title validation

All tests now align with the updated get_first_heading() implementation that skips frontmatter, MDX imports, and empty lines before checking for the migration guide title.

Co-Authored-By: [email protected] <[email protected]>
@ian-at-airbyte
Copy link
Contributor

ian-at-airbyte commented Oct 9, 2025

(aside) The current connector test for migration guides is extremely strict. It does not permit the use of MDX imports or front matter. However, these are pretty standard in Docusaurus and it's better that we expect them.

Rather than trying to move the import around to satisfy the test, I've had Devin update the test and add a new helper function. It still requires the article start with a proper heading as before, but it's no longer necessary for that heading to be on line 1 if front matter is present in the topic.

I simulated a connector update with #67595 and all checks passed.

migration_guide_file_path=tmp_path / "migration_guide.md",
)
connector.migration_guide_file_path.write_text(
"import SpecialDoc from '@site/src/components/SpecialDoc';\n\n" "# Test Connector Migration Guide\n" "## Upgrading to 1.0.0\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our purposes here, does this need to be an actual functioning import with a real file? Or is a mock enough? IE I guess what I'm saying is if this were a real Docusaurus instance, this wouldn't build. But it's OK if we don't expect it to.

Copy link
Contributor Author

Hi Ian! Great question. A mock import is sufficient for this unit test.

The test is specifically validating the text parsing logic in get_first_heading(), which simply:

  • Skips lines that start with import (without validating the import itself)
  • Finds the first markdown heading after skipping frontmatter, imports, and empty lines

The validation doesn't attempt to resolve or validate the actual MDX imports - it just needs to recognize and skip the line pattern. The mock import import SpecialDoc from '@site/src/components/SpecialDoc'; is sufficient to test this behavior.

For real-world verification, I created PR #67595 which recreates changes to source-zendesk-support (which has a real MDX import in its migration guide). All CI checks passed, confirming the validation works correctly with actual MDX imports in production.

So to answer your question: the mock is fine for this unit test context, and we don't need the import to be buildable by Docusaurus. ✅

@DanyloGL DanyloGL moved this from New PRs to Waiting Eng Team in 🧑‍🏭 Community Pull Requests Oct 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Waiting Eng Team

Development

Successfully merging this pull request may close these issues.

2 participants